Generating Component-based Supervised Learning Programs From Crowdsourced Examples

نویسندگان

  • Jose Cambronero
  • Martin Rinard
  • José Cambronero
چکیده

We present CrowdLearn, a new system that processes an existing corpus of crowdsourced machine learning programs to learn how to generate e�ective pipelines for solving supervised machine learning problems. CrowdLearn uses a probabilistic model of program likelihood, conditioned on the current sequence of pipeline components and on the characteristicsof the inputdata to thenextcomponent in thepipeline, to predict candidate pipelines. Our results highlight the e�ectivenessof this technique in leveragingexistingcrowdsourced programs to generate pipelines that work well on a range of supervised learning problems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Predicting a Correct Program in Programming By Example

We study the problem of efficiently predicting a correct program from a large set of programs induced from few input-output examples in Programming-by-Example (PBE) systems. This is an important problem for making PBE systems usable so that users do not need to provide too many examples to learn the desired program. We first characterize the three main types of expressions used for expression s...

متن کامل

Predicting a Correct Program in Programming by Example

We study the problem of efficiently predicting a correct program from a large set of programs induced from few input-output examples in Programming-byExample (PBE) systems. This is an important problem for making PBE systems usable so that users do not need to provide too many examples to learn the desired program. We first formalize the two classes of sharing that occurs in version-space algeb...

متن کامل

Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge

The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory used for in clustering problems. Previous researches showed that this theory can significantly increase the stability and performance of...

متن کامل

Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge

The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory used for in clustering problems. Previous researches showed that this theory can significantly increase the stability and performance of...

متن کامل

Improving Quality of Crowdsourced Labels via Probabilistic Matrix Factorization

Quality assurance in crowdsourced annotation often involves having a given example labeled multiple times by different workers, then aggregating these labels. Unfortunately, the worker-example label matrix is typically sparse and imbalanced for two reasons: 1) the average crowd worker judges few examples; and 2) few labels are typically collected per example to reduce cost. To address this miss...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017